Open Access
Least angle regression
Bradley Efron, Trevor Hastie, Iain Johnstone, Robert Tibshirani
Ann. Statist. 32(2): 407-499 (April 2004). DOI: 10.1214/009053604000000067

Abstract

The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived:

(1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods.

(2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method; this connection explains the similar numerical results previously observed for the Lasso and Stagewise, and helps us understand the properties of both methods, which are seen as constrained versions of the simpler LARS algorithm.

(3) A simple approximation for the degrees of freedom of a LARS estimate is available, from which we derive a C_p estimate of prediction error; this allows a principled choice among the range of possible LARS estimates.

LARS and its variants are computationally efficient: the paper describes a publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.
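For reference, the Lasso of property (1) is the usual L1-constrained least-squares problem, and the C_p criterion of property (3) takes a Mallows-type form; the notation below (response y, predictor matrix X, bound t, variance estimate σ̄²) follows standard usage rather than the paper's exact symbols.

```latex
% Lasso: ordinary least squares with an L1 bound on the coefficients
\hat{\beta}(t) = \arg\min_{\beta} \; \lVert y - X\beta \rVert^2
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t .

% Mallows-type C_p estimate of prediction error for a fitted vector \hat{\mu},
% where df(\hat{\mu}) is its degrees of freedom; the paper's approximation
% gives df \approx k after k LARS steps.
C_p(\hat{\mu}) = \frac{\lVert y - \hat{\mu} \rVert^2}{\bar{\sigma}^2} - n + 2\,\mathrm{df}(\hat{\mu}) .
```

As a concrete illustration of the full-path computation claimed in property (1), scikit-learn ships a LARS/Lasso path implementation. The sketch below is a minimal example, not the authors' own code: the synthetic data and all parameter choices are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import lars_path

# Synthetic regression problem: 50 observations, 10 candidate covariates,
# of which only the first 3 truly influence the response.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
beta_true = np.zeros(10)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(50)

# method="lasso" applies the LARS modification that traces the entire Lasso
# coefficient path in one pass; method="lar" gives plain Least Angle Regression.
alphas, active, coefs = lars_path(X, y, method="lasso")

# coefs has one column per breakpoint of the piecewise-linear path, i.e. per
# point at which a covariate enters (or, under the Lasso rule, leaves) the
# active set.
print("breakpoints:", len(alphas))
print("order of entry:", active)
print("final coefficients:", np.round(coefs[:, -1], 2))
```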

Citation

Bradley Efron, Trevor Hastie, Iain Johnstone, Robert Tibshirani. "Least angle regression." Ann. Statist. 32(2): 407-499, April 2004. https://doi.org/10.1214/009053604000000067

Information

Published: April 2004
First available in Project Euclid: 28 April 2004

zbMATH: 1091.62054
MathSciNet: MR2060166
Digital Object Identifier: 10.1214/009053604000000067

Subjects:
Primary: 62J07

Keywords: boosting, coefficient paths, lasso, linear regression, variable selection

Rights: Copyright © 2004 Institute of Mathematical Statistics
